#Adaptive Quantization Noise
16/10/2025
QeRL Unlocks 32B RL Training on One H100 with NVFP4, Faster Rollouts and Better Exploration
QeRL combines NVFP4 weight quantization with LoRA and Adaptive Quantization Noise (AQN) to raise rollout throughput and improve exploration, allowing a 32B policy to be trained on a single H100 with competitive accuracy.
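To see why 4-bit weights matter for fitting a 32B policy on one GPU, here is a rough back-of-the-envelope estimate of weight storage alone. It is a sketch under stated assumptions, not a figure from the QeRL authors: it assumes NVFP4 stores 4-bit values plus one 8-bit scale per block of 16 weights (about 4.5 effective bits per weight), assumes an 80 GB H100, and ignores activations, LoRA adapters, optimizer state, and the KV cache. The function and constant names are illustrative.

```python
def weight_memory_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


NUM_PARAMS = 32e9            # 32B policy, as stated in the summary
H100_MEMORY_GB = 80          # assumption: 80 GB H100 variant
BF16_BITS = 16               # baseline 16-bit weights
# Assumption: 4-bit values plus one 8-bit scale shared by each block of 16 weights.
NVFP4_BITS = 4 + 8 / 16      # 4.5 effective bits per weight

bf16_gb = weight_memory_gb(NUM_PARAMS, BF16_BITS)
nvfp4_gb = weight_memory_gb(NUM_PARAMS, NVFP4_BITS)

print(f"BF16 weights:  {bf16_gb:.1f} GB of {H100_MEMORY_GB} GB")
print(f"NVFP4 weights: {nvfp4_gb:.1f} GB of {H100_MEMORY_GB} GB")
```

Under these assumptions the 16-bit weights alone take most of the card, while the 4-bit weights leave the bulk of memory free for rollouts and the trainable LoRA parameters.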